The impact of partitioning on phylogenomic accuracy

نویسندگان

  • Diego Darriba
  • David Posada
چکیده

Several strategies have been proposed to assign substitution models in phylogenomic datasets, or partitioning. The accuracy of these methods, and most importantly, their impact on phylogenetic estimation has not been thoroughly assessed using computer simulations. We simulated multiple partitioning scenarios to benchmark two a priori partitioning schemes (one model for the whole alignment, one model for each data block), and two statistical approaches (hierarchical clustering and greedy) implemented in PartitionFinder and in our new program, PartitionTest. Most methods were able to identify optimal partitioning schemes closely related to the true one. Greedy algorithms identified the true partitioning scheme more frequently than the clustering algorithms, but selected slightly less accurate partitioning schemes and tended to underestimate the number of partitions. PartitionTest was several times faster than PartitionFinder, with equal or better accuracy. Importantly, maximum likelihood phylogenetic inference was very robust to the partitioning scheme. Best-fit partitioning schemes resulted in optimal phylogenetic performance, without appreciable differences compared to the use of the true partitioning scheme. However, accurate trees were also obtained by a “simple” strategy consisting of assigning independent GTR+G models to each data block. On the contrary, leaving the data unpartitioned always diminished the quality of the trees inferred, to a greater or lesser extent depending on the simulated scenario. The analysis of empirical data confirmed these trends, although suggesting a stronger influence of the partitioning scheme. Overall, our results suggests that statistical partitioning, but also the a priori assignment of independent GTR+G models, maximize phylogenomic performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

P re p rin t The impact of partitioning on phylogenomic accuracy

Several strategies have been proposed to assign substitution models in phylogenomic datasets, or partitioning. The accuracy of these methods, and most importantly, their impact on phylogenetic estimation has not been thoroughly assessed using computer simulations. We simulated multiple partitioning scenarios to benchmark two a priori partitioning schemes (one model for the whole alignment, one ...

متن کامل

Strategies for Partitioning Clock Models in Phylogenomic Dating: Application to the Angiosperm Evolutionary Timescale

Evolutionary timescales can be inferred from molecular sequence data using a Bayesian phylogenetic approach. In these methods, the molecular clock is often calibrated using fossil data. The uncertainty in these fossil calibrations is important because it determines the limiting posterior distribution for divergence-time estimates as the sequence length tends to infinity. Here, we investigate ho...

متن کامل

Response Surface Methodology for the Evaluation of Lysozyme Partitioning in Poly (Vinyl Pyrrolidone) and Potassium Phosphate Aqueous Two-Phase System

The partitioning of lysozyme and extraction yield in an aqueous two-phase system containing Poly Vinyl Pyrrolidone (PVP) K25 and potassium phosphate were investigated as a function of weight percent of salt and PVP in the feed, temperature, and pH. To investigate partitioning behavior, the central composite design was considered using a quadratic model. According to the results of the model...

متن کامل

The impact of GC bias on phylogenetic accuracy using targeted enrichment phylogenomic data.

The field of sequence based phylogenetic analyses is currently being transformed by novel hybrid-based targeted enrichment methods, such as the use of ultraconserved elements (UCEs). Rather than analyzing relationships among organisms using a small number of genes, these methods now allow us to evaluate relationships with many hundreds to thousands of individual gene loci. However, the inclusio...

متن کامل

Predicting Implantation Outcome of In Vitro Fertilization and Intracytoplasmic Sperm Injection Using Data Mining Techniques

Objective The main purpose of this article is to choose the best predictive model for IVF/ICSI classification and to calculate the probability of IVF/ICSI success for each couple using Artificial intelligence. Also, we aimed to find the most effective factors for prediction of ART success in infertile couples. MaterialsAndMethods In this cross-sectional study, the data of 486 patients are colle...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015